Detection of anomalous program execution using hardware-based micro architectural data

ABSTRACT

Disclosed are devices, systems, apparatus, methods, products, media and other implementations, including a method that includes obtaining hardware-based micro-architectural data, including hardware-based micro-architectural counter data, for a hardware device executing one or more processes, and determining based, at least in part, on the hardware-based micro-architectural data whether at least one of the one or more processes executing on the hardware device corresponds to a malicious process. In some embodiments, determining based on the hardware-based micro-architectural data whether the at least one of the one or more processes corresponds to a malicious process may include applying one or more machine-learning procedures to the hardware-based micro-architectural data to determine whether the at least one of the one or more processes corresponds to the malicious process.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. patent application Ser. No. 14/778,007, filed Sep. 17, 2015, which is a § 371 National Stage of International Application No. PCT/US2013/068451 filed on Nov. 5, 2013, which claims the benefit of, and priority to, provisional U.S. application Ser. No. 61/803,029, entitled “SYSTEMS AND METHODS TO DETECT ANOMALOUS PROGRAM EXECUTION USING PROCESSOR MICROARCHITECTURAL EVENTS,” and filed Mar. 18, 2013, the contents of which are incorporated herein by reference in their entireties.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This invention was made with government support under grant FA8750-10-2-0253 awarded by the Air Force Research Laboratory, Information Directorate. The government has certain rights in the invention.

BACKGROUND

The proliferation of computers in a particular domain is generally followed by the proliferation of malicious processes (e.g., malware) in that domain. For example, systems that include the latest Android devices are laden with viruses, rootkits, spyware, adware and other classes of malicious processes. Despite the existence of anti-virus software, malware threats (as well as threats from other types of malicious processes) persist and are growing. Unfortunately, there exist myriad ways to subvert commercial anti-virus software, including simply disabling the anti-virus. Furthermore, malware can mutate into new variants, which makes static detection of malware difficult.

Examples of some common malware processes are provided below:

Malware: Brief Description

Worm: Malware that propagates itself from one infected host to other hosts via exploits available on the surface (system call interfaces) of the operating system.

Virus: Malware that attaches itself to running programs and spreads itself through users' interactions with various systems.

Polymorphic Virus: A virus that, when replicating to attach to a new target, alters its payload to evade detection, i.e., takes on a different shape but performs the same function.

Metamorphic Virus: A virus that, when replicating to attach to a new target, alters both the payload and functionality, including the framework for generating future changes.

Trojan: Malware that masquerades as non-malware and acts maliciously once installed (opening backdoors, interfering with system behavior, etc.).

AdWare: Malware that forces the user to deal with unwanted advertisements.

SpyWare: Malware that secretly observes and reports on users' computer usage and personal information accessible therein.

Botnet: Malware that employs a user's computer as a member of a network of infected computers controlled by a central malicious agency.

Rootkit: A malware package that exploits security holes in the operating system to gain superuser access. Usually, a rootkit attempts to hide its existence while performing malicious superuser activities by tampering with the file system.

Malicious processes, such as malware, were originally created to attain notoriety or for fun, but today malware deployment is mostly motivated by economic gains. There are reports of active underground markets for personal information, credit cards, logins into sensitive machines in the United States, etc. Also, malicious processes such as malware have been developed to target specific computers for industrial espionage purposes and/or for sabotage.

SUMMARY

The devices, systems, apparatus, methods, products, media and other implementations disclosed herein include a method including obtaining hardware-based micro-architectural data, including hardware-based micro-architectural counter data, for a hardware device executing one or more processes, and determining based, at least in part, on the hardware-based micro-architectural data whether at least one of the one or more processes executing on the hardware device corresponds to a malicious process.

Embodiments of the method may include at least some of the features described in the present disclosure, including one or more of the following features.

Obtaining the hardware-based micro-architectural data may include obtaining the hardware-based micro-architectural data at various time instances.

Obtaining the hardware-based micro-architectural data at the various time instances may include performing one or more of, for example, a data push operation initiated by the hardware device to send the micro-architectural data, and/or a data pull operation, initiated by an antivirus engine, to send the micro-architectural data.

Obtaining the hardware-based micro-architectural data may include obtaining multi-core hardware-based micro-architectural data resulting from execution of the one or more processes on a processor device with multiple processor cores, and correlating the respective hardware-based micro-architectural data obtained from each of the multiple processor cores to the one or more processes.

Determining based on the hardware-based micro-architectural data whether the at least one of the one or more processes corresponds to a malicious process may include applying one or more machine-learning procedures to the hardware-based micro-architectural data to determine whether the at least one of the one or more processes corresponds to the malicious process.

Applying the one or more machine-learning procedures to the hardware-based micro-architectural data to determine whether the at least one of the one or more processes corresponds to the malicious process may include matching the obtained hardware-based micro-architectural data to previously identified patterns of hardware-based micro-architectural data associated with one or more malicious processes.

The method may further include obtaining updates for the previously identified patterns of hardware-based micro-architectural data associated with the one or more malicious processes.

Obtaining the updates may include downloading encrypted data for the previously identified patterns of hardware-based micro-architectural data associated with the one or more malicious processes to an antivirus engine in communication with the hardware device providing the hardware-based micro-architectural data, decrypting at the antivirus engine the downloaded encrypted data for the previously identified patterns of hardware-based micro-architectural data associated with the one or more malicious processes, and updating a revision counter maintained by the antivirus engine indicating a revision number of a most recent update of the previously identified patterns of hardware-based micro-architectural data.

The one or more machine learning procedures may include one or more of, for example, a k-nearest neighbor procedure, a decision tree procedure, a random forest procedure, an artificial neural network procedure, a tensor density procedure, and/or a hidden Markov model procedure.

The malicious process may include one or more of, for example, a malware process, and/or a side-channel attack process.

The hardware-based micro-architectural data may include one or more of, for example, processor load density data, branch prediction performance data, and/or data regarding instruction cache misses.

In some variations, a system is provided that includes a hardware device executing one or more processes, and an antivirus engine in communication with the hardware device. The antivirus engine is configured to obtain hardware-based micro-architectural data, including hardware-based micro-architectural counter data, for the hardware device executing the one or more processes, and determine based, at least in part, on the hardware-based micro-architectural data whether at least one of the one or more processes executing on the hardware device corresponds to a malicious process.

Embodiments of the system may include at least some of the features described in the present disclosure, including at least some of the features described above in relation to the method, as well as one or more of the following features.

The antivirus engine configured to obtain the hardware-based micro-architectural data may be configured to obtain the hardware-based micro-architectural data at various time instances.

The antivirus engine configured to obtain the hardware-based micro-architectural data at the various time instances may be configured to receive the micro-architectural data in response to one or more of, for example, a data push operation initiated by the hardware device, and/or a data pull operation initiated by the antivirus engine.

The antivirus engine configured to determine based on the hardware-based micro-architectural data whether the at least one of the one or more processes corresponds to a malicious process may be configured to apply one or more machine-learning procedures to the hardware-based micro-architectural data to determine whether the at least one of the one or more processes corresponds to the malicious process.

The antivirus engine configured to apply the one or more machine-learning procedures to the hardware-based micro-architectural data to determine whether the at least one of the one or more processes corresponds to the malicious process may be configured to match the obtained hardware-based micro-architectural data to previously identified patterns of hardware-based micro-architectural data associated with one or more malicious processes.

The antivirus engine may further be configured to obtain updates for the previously identified patterns of hardware-based micro-architectural data associated with the one or more malicious processes.

In some variations, a computer readable media storing a set of instructions executable on at least one programmable device is provided. The set of instructions, when executed, causes operations including obtaining hardware-based micro-architectural data, including hardware-based micro-architectural counter data, for a hardware device executing one or more processes, and determining based, at least in part, on the hardware-based micro-architectural data whether at least one of the one or more processes executing on the hardware device corresponds to a malicious process.

Embodiments of the computer readable media may include at least some of the features described in the present disclosure, including at least some of the features described above in relation to the method and the system.

In some variations, an apparatus is provided. The apparatus includes means for obtaining hardware-based micro-architectural data, including hardware-based micro-architectural counter data, for a hardware device executing one or more processes, and means for determining based, at least in part, on the hardware-based micro-architectural data whether at least one of the one or more processes executing on the hardware device corresponds to a malicious process.

Embodiments of the apparatus may include at least some of the features described in the present disclosure, including at least some of the features described above in relation to the method, the system, and the computer readable media.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly or conventionally understood. As used herein, the articles “a” and “an” refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element. “About” and/or “approximately” as used herein when referring to a measurable value such as an amount, a temporal duration, and the like, is meant to encompass variations of ±20% or ±10%, ±5%, or ±0.1% from the specified value, as such variations are appropriate in the context of the systems, devices, circuits, methods, and other implementations described herein. “Substantially” as used herein when referring to a measurable value such as an amount, a temporal duration, a physical attribute (such as frequency), and the like, is also meant to encompass variations of ±20% or ±10%, ±5%, or ±0.1% from the specified value, as such variations are appropriate in the context of the systems, devices, circuits, methods, and other implementations described herein.

As used herein, including in the claims, “or” or “and” as used in a list of items prefaced by “at least one of” or “one or more of” indicates that any combination of the listed items may be used. For example, a list of “at least one of A, B, or C” includes any of the combinations A or B or C or AB or AC or BC and/or ABC (i.e., A and B and C). Furthermore, to the extent more than one occurrence or use of the items A, B, or C is possible, multiple uses of A, B, and/or C may form part of the contemplated combinations. For example, a list of “at least one of A, B, or C” may also include AA, AAB, AAA, BB, etc.

As used herein, including in the claims, unless otherwise stated, a statement that a function, operation, or feature, is “based on” an item and/or condition means that the function, operation, or feature is based on the stated item and/or condition and may be based on one or more items and/or conditions in addition to the stated item and/or condition.

Details of one or more implementations are set forth in the accompanying drawings and in the description below. Further features, aspects, and advantages will become apparent from the description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other aspects will now be described in detail with reference to the following drawings.

FIG. 1 includes example illustrations of graphs of hardware micro-architectural activity for several different processes.

FIG. 2 is a schematic diagram of an example system to detect malicious processes.

FIG. 3 is a flow chart of an example procedure to detect malicious processes.

FIG. 4 is a schematic diagram of an example system in which an AV engine is implemented.

FIG. 5 is a table of Android malware families, and detection results therefor, tested using, for example, the systems and procedures of FIGS. 2-4.

FIG. 6 is a graph illustrating accuracy of various binary classifiers when applied to micro-architectural data produced, in part, by Android malware.

FIG. 7 includes graphs showing the accuracy of classifiers when applied to rootkit processes.

FIG. 8A is a schematic diagram of an example security update payload.

FIG. 8B is a flowchart of an example procedure to receive a security update payload and update the configuration of an AV engine.

Like reference symbols in the various drawings indicate like elements.

DESCRIPTION

Described herein are systems, devices, apparatus, computer program products, and other implementations for detection of anomalous program execution processes, such as malware. In some implementations, hardware-based micro-architectural data, including hardware-based micro-architectural counter data (e.g., from hardware-based performance counters) is obtained from a hardware device (such as a processor/computing-based device), and analyzed (e.g., to analyze the temporal behavior of executing processes that resulted in the micro-architectural data) using machine-learning procedures (e.g., classification procedures), to identify malicious processes from one or more processes executing on the hardware device being monitored.

Generally, processes executing on a hardware-implemented controller device (be it a general-purpose processor, an application-specific controller, etc.) exhibit phase behavior. A process (be it a malicious or non-malicious process) that is configured to achieve a particular functionality may perform activity A for a while, then switch to activity B, then to activity C. Although such a process may alternate in the exact order of performance of the activities, typically the process would need to perform activities A, B, and C to accomplish its particular functionality. The activity phases that correspond to a particular process typically correspond to patterns in architectural and micro-architectural events. Additionally, different processes (e.g., different programs configured to perform different functionalities) result in different hardware-based micro-architectural behavior. For example, FIG. 1, providing example illustrations of micro-architectural activity graphs for several different processes, shows that the micro-architectural behavior for different processes (in the example of FIG. 1, the processes were from the SPEC benchmark suite) tends to be different, resulting in different hardware micro-architectural traces or patterns. For example, as shown in FIG. 1, the behavior for the L1 exclusive hits and the executed branch instructions monitored for the ‘bzip2’ process (as illustrated in graphs 102 and 104) is different from the behavior for the L1 exclusive hits and the executed branch instructions for the ‘me’ process (illustrated in graphs 112 and 114), which in turn is different from the behavior for the L1 exclusive hits and the executed branch instructions for the ‘sjeng’ process (illustrated in graphs 122 and 124). It is to be noted that monitoring the micro-architectural behavior may be facilitated by specific built-in counters, and/or may be achieved by measuring/monitoring occurrences of events at specific points on the circuits of the hardware device being monitored.

Accordingly, processes executing on a hardware device (e.g., hardware-based controller device) may be distinguished (and thus identified) based on such time-varying micro-architectural signatures/traces. Generally, minor variations in the exact implementation of a particular process do not significantly affect the generated hardware-based micro-architectural traces resulting from the process, and, therefore, identifying the process and/or determining whether the process is malicious or not (e.g., via machine learning classification procedures, heuristic and non-heuristic procedures analyzing the micro-architectural data, etc.) can still be performed. This is because regardless of how the writers of a malicious process (e.g., malware) change the underlying implementation (e.g., the software program), the semantics of the process do not change significantly. For instance, if a piece of malware is designed to collect and log GPS data, then no matter how its writer re-arranges the code, the process will still have to collect and log GPS data. In other words, the activity phases characterizing the process will generally remain regardless of the specific implementation of the process. Additionally, a particular task that needs to be accomplished will include various sub-tasks that cannot be significantly modified. For example, a GPS logger will always have to warm up the GPS, wait for signals, decode the data, log it, and, at some future point, exfiltrate the data back to the rogue user (privacy thief) seeking to obtain the data. As a result of these generally invariant operations required to accomplish particular tasks or processes, particular phases of the malicious process' execution remain relatively invariant for different implementation variations.

Thus, hardware-based micro-architectural data (e.g., data from hardware performance counters) such as processor load density data, branch prediction performance data, data regarding instruction cache misses, etc., can be used to identify malware and/or other types of malicious processes. Experimental results (more particularly discussed below) show that the detection techniques/procedures described herein tend to be robust to variations in malware programs (or other types of malicious processes). Thus, after examining a small set of variations within a family of malware on a processing platform (e.g., Android ARM and Intel Linux platforms), many variations within that family may be substantially accurately detected. Further, various implementations described herein enable malicious process detectors, such as the detectors described herein, to run securely beneath the system software, thus reducing, or altogether avoiding, the danger of being turned off.

Accordingly, in some embodiments, methods, systems, devices, products, and other implementations are disclosed that include a method including obtaining hardware-based micro-architectural data, including, for example, hardware-based micro-architectural counter data, for a hardware device executing one or more processes, and determining based, at least in part, on the hardware-based micro-architectural data whether at least one of the one or more processes executing on the processor-based system corresponds to a malicious process. The malicious process being identified/detected may include one or more of, for example, a malware process, and/or a side-channel attack process.

With reference to FIG. 2, a schematic diagram of an example system 200 to detect and/or resolve malicious processes is shown. The system 200 includes an antivirus (AV) engine 210 that comprises, in some embodiments, a performance counter sampling unit (also referred to as a “sampler”) 212, a performance counter database 214 that stores/maintains representative micro-architectural profiles or signatures (including performance counter profiles or signatures) corresponding to various processes (including malware processes), and micro-architectural data collected by the sampling unit 212, and a classifier 216 configured to analyze the collected hardware micro-architectural data to determine if the one or more processes running on the hardware device being observed/monitored includes at least one malicious process (in some embodiments, the classifier 216 may also be configured to more particularly identify such a malicious process). The AV engine 210 is generally in communication with one or more hardware devices such as processor devices 220 and/or 222 shown in FIG. 2.

The sampling unit 212 is configured to obtain hardware-based micro-architectural data, including, for example, hardware-based micro-architectural performance counter data from the one or more hardware devices, which may include devices such as controller devices, e.g., processor devices such as the devices 220 and 222, or any other type of controller devices including controller devices implemented using modules such as an FPGA (field programmable gate array), an ASIC (application-specific integrated circuit), a DSP processor, etc. Generally, hardware-based controller devices include hardware-related performance counters that may be configured to count a variety of events such as cycles, instructions, cache misses, etc. In some implementations, these performance counters are used to assist in software performance optimization. For example, the Intel x86 processor device implements four (4) configurable performance counters, and the OMAP4460 processor with dual ARM Cortex-A9 cores includes six (6) configurable performance counters. The AV engine 210 is implemented to obtain micro-architectural data (e.g., performance counter data) from known controller designs, and as such the AV engine 210 may be configured to obtain micro-architectural data from specific known performance counters particular to the hardware that is being monitored by the AV engine. That is, knowledge of the specific architecture of the hardware to be monitored may be required in order to obtain performance counter data and other micro-architectural data from the performance counters corresponding to the specific architecture. Examples of micro-architectural counters used on an Intel x86 processor architecture include:

0x0440—L1D_CACHE_LD.E_STATE;

0x0324—L2_RQSTS.LOADS;

0x03b1—UOPS_EXECUTED.PORT (1 or 2); and

0x7f88—BR_INST_EXEC.ANY.

Examples of common counters (feature event number assignments) on the ARM Cortex-A9 cores architecture, through which micro-architectural data can be obtained, include event numbers:

-   0x06—Memory-reading instruction architecturally executed (counter increments for every instruction that explicitly read data);
-   0x07—Memory-writing instruction architecturally executed (counter increments for every instruction that explicitly wrote data);
-   0x0C—Software change of PC, except by an exception, architecturally executed (counter does not increment for a conditional instruction that fails its condition code);
-   0x0D—Immediate branch architecturally executed (counter counts for all immediate branch instructions that are architecturally executed);
-   0x0F—Unaligned access architecturally executed (counter counts each instruction that is an access to an unaligned address); and
-   0x12—Counter counts branch or other change in program flow that could have been predicted by the branch prediction resources of the processor.

Additional information on micro-architectural counters that may be implemented on the ARM Cortex-A9 cores architecture is provided, for example, at “ARM® Architecture Reference Manual, ARM®v7-A and ARM®v7-R edition, Errata markup,” the content of which is incorporated herein by reference in its entirety.

In some embodiments, the sampling unit 212 may be configured to obtain hardware micro-architectural data (including micro-architectural performance counter data) from the counters of the hardware monitored through data push procedures and/or through data pull procedures. For example, when pulling data, the AV engine 210 initiates the data collection, causing hardware targets (e.g., specific hardware performance counters implemented in the hardware being monitored) to be accessed by, for example, interrupting execution of the counters and/or querying the counters without interruptions. In some embodiments, the AV engine 210 may be configured, e.g., via the sampling unit 212, to interrupt the hardware once every N cycles (where N may be a constant pre-determined number, or may be a varying number, e.g., based on a random or pseudo-random generator), and sample the various performance/event counters, as well as other values (e.g., the currently executing process' PID). When performing sampling operations using an interrupt-based procedure, the sampling unit 212 may be configured to send control signals or otherwise cause the executing hardware to be interrupted, access the performance counters and/or other storage hardware, and retrieve the values stored on the counters of the interrupted hardware for further processing by the AV engine 210. In some embodiments, upon interruption of the hardware and/or the counters, the interrupted hardware may first store data held by its various performance counters in a central storage location (e.g., in a state stack), and the data stored at the central storage location may then be accessed and retrieved by the sampling unit 212.

When implementing a data-push sampling mode, data held by the performance counters (and/or other sampling points on the hardware being monitored) may be configured to be communicated to the AV engine 210 (e.g., to the sampling unit 212) at regular or irregular intervals, with or without interrupting the execution of the hardware being monitored or of its performance counters. Thus, in such embodiments, the hardware device to be monitored is configured to initiate sending the micro-architectural data to the AV engine 210. For example, in a data push mode, the hardware device being monitored may be configured to send micro-architectural data without needing to receive a request (e.g., from the sampling unit 212).
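By way of illustration only, the pull and push sampling modes described above may be sketched in software as follows. This is a minimal simulation under stated assumptions, not the disclosed hardware implementation: the `SimulatedCore` class, its counter names, and the `pull_samples` and `push_samples` functions are hypothetical names introduced for this sketch, and the counter increments are random placeholders rather than measured values.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Sample:
    cycle: int                # cycle count at which the sample was taken
    pid: int                  # ID of the process executing at that time
    counters: Dict[str, int]  # snapshot of the performance counter values


class SimulatedCore:
    """Hypothetical stand-in for a monitored hardware device (assumption)."""

    def __init__(self, seed: int = 0, pid: int = 100) -> None:
        self._rng = random.Random(seed)
        self.cycle = 0
        self.pid = pid
        self.counters = {"branches": 0, "l1_hits": 0}

    def run(self, cycles: int) -> None:
        # Advance execution; counters only ever increase, as real ones do.
        self.cycle += cycles
        for name in self.counters:
            self.counters[name] += self._rng.randint(0, cycles)

    def read_counters(self) -> Dict[str, int]:
        return dict(self.counters)


def pull_samples(core: SimulatedCore, n_cycles: int, count: int) -> List[Sample]:
    """Pull mode: the sampler interrupts the core once every n_cycles cycles."""
    samples = []
    for _ in range(count):
        core.run(n_cycles)
        samples.append(Sample(core.cycle, core.pid, core.read_counters()))
    return samples


def push_samples(core: SimulatedCore, intervals: Iterable[int],
                 sink: Callable[[Sample], None]) -> None:
    """Push mode: the device sends data to the sink without being asked."""
    for interval in intervals:
        core.run(interval)
        sink(Sample(core.cycle, core.pid, core.read_counters()))
```

In this sketch the only difference between the two modes is which side decides when a sample is produced: `pull_samples` is driven by the sampler, while `push_samples` is driven by the (simulated) device at regular or irregular intervals.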

The sampling operations implemented by the sampling unit 212 of the AV engine 210 thus obtain time-based data of the output of the various hardware performance counters (and/or other output points) monitored for one or more processes executing on the hardware being monitored. As noted, in addition to micro-architectural data, information such as a process' ID (e.g., PID) is also recorded to enable associating/correlating the micro-architectural data with the process whose execution resulted in the obtained micro-architectural data. By also recording processes' IDs and associating/correlating them with the obtained micro-architectural data, the implementations described herein can track micro-architectural data resulting from execution of a process across different hardware devices. For example, in situations where a system being monitored includes multiple processor cores (each with its own set of performance counters), where processes/threads may suspend and resume execution on different cores, maintaining processes' PIDs along with obtained micro-architectural data may enable tracking the behavior of processes as they switch execution to different hardware devices.
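The PID-based correlation described above may be sketched, for illustration, as a regrouping of per-core samples into per-process traces. The sample layout (a tuple of timestamp, PID, and counter values) and the function name `correlate_by_pid` are assumptions made for this example and are not taken from the disclosure.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed sample layout: (timestamp, pid, counter values).
CoreSample = Tuple[int, int, Dict[str, int]]
# Per-process trace entry: (timestamp, core ID, counter values).
TraceEntry = Tuple[int, int, Dict[str, int]]


def correlate_by_pid(
    per_core: Dict[int, List[CoreSample]]
) -> Dict[int, List[TraceEntry]]:
    """Merge samples from all cores into one time-ordered trace per PID."""
    traces: Dict[int, List[TraceEntry]] = defaultdict(list)
    for core_id, samples in per_core.items():
        for timestamp, pid, counters in samples:
            traces[pid].append((timestamp, core_id, counters))
    for trace in traces.values():
        # Order each process trace by time so migrations appear in sequence.
        trace.sort(key=lambda entry: entry[0])
    return dict(traces)


# Hypothetical data: process 7 starts on core 0 and resumes on core 1.
example = {
    0: [(10, 7, {"branches": 5}), (20, 9, {"branches": 3})],
    1: [(30, 7, {"branches": 8})],
}
traces = correlate_by_pid(example)
```

Here the trace for PID 7 contains its core 0 sample followed by its core 1 sample, so the behavior of the process can be followed across the migration.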

In some embodiments, the sampling unit 212 may be realized, at least in part, on the hardware device being monitored. For example, the sampling unit 212 may be implemented as a hardware realization on a specialized hardware-based controller (such as an FPGA, an ASIC, etc.). In some embodiments, the sampling unit 212 may be realized, at least in part, as a software implementation executing on a machine that includes a processor-based device that is being monitored by the AV engine 210 to detect malicious processes that are executing on the machine. For example, one of a processor-device's multiple general-purpose cores may be allocated to execute a software realization of at least part of the AV engine.

As noted, the AV engine 210 also includes a micro-architectural database 214 configured to store the micro-architectural data obtained from the hardware being monitored/observed, as well as pre-determined data sets, obtained from remote nodes (e.g., servers), that include data representative of micro-architectural signatures/traces of known malicious processes (e.g., time-series traces for various micro-architectural events or performance counters) and training data that includes micro-architectural data (e.g., time-based data) for non-malicious/benign processes. As will be described below in greater detail, in some embodiments, the AV engine 210 is periodically (at regular or irregular intervals) updated to include new or modified micro-architectural signature data defining the behavior of new or existing malicious processes by receiving from a remote node micro-architectural signature data.

In some embodiments, the database 214 may be realized, at least in part, on the hardware device being monitored. In some embodiments, the micro-architectural database 214 may be realized, at least in part, as a software implementation executing on a machine that includes a processor-based device being monitored by the AV engine 210 (e.g., allocating one of a processor-device's multiple general-purpose cores to execute a software realization of the database 214).

Collection of micro-architectural data (including micro-architectural performance counter data) using, for example, the sampling unit 212, and/or storage of the collected data using, for example, the database 214, provides a relatively large amount of labeled data that includes micro-architectural data resulting from execution of malicious processes (e.g., malware) and non-malicious processes. Thus, in some embodiments, the classifier 216 (also referred to as a machine-learning engine) is configured to determine whether at least one of the processes with respect to which the micro-architectural data was collected corresponds to a malicious process (e.g., whether some of the micro-architectural data traces collected potentially resulted from execution of the at least one malicious process) and/or identify the at least one malicious process.

In some implementations, a classifier, such as the classifier 216 of the AV engine 210, may be configured to iteratively analyze training input data and the input data's corresponding output (e.g., a determination of a process type and/or identification of a process corresponding to the input data), and derive functions or models that cause subsequent micro-architectural inputs, collected from the hardware being monitored, to produce outputs consistent with the classifier's learned behavior. Such a classifier should be configured to distinguish malicious processes from non-malicious processes.

Generally, machine learning classifiers are configured to examine data items and determine to which of N groups (classes) each data item belongs. Classification procedures can produce a vector of probabilities, e.g., the likelihoods of the data item belonging to each class. In the case of malicious process detection, two classes may be defined: malicious process (e.g., malware) and non-malicious process (e.g., non-malware). As a result, the output from classifiers may include probabilities representing the likelihood of a data item being malicious. In situations where a particular classifier is not adapted to process/classify time-series data (like the time-series micro-architectural data collected by the AV engine 210), this difficulty can be overcome by arranging input data (e.g., corresponding to micro-architectural events occurring at a particular location of the hardware, such as at a particular counter) that occurred at different time instances into a single vector of features that is presented as input to the classifier. Under this approach, time-based data may be consolidated into a vector of data, where each vector point corresponds to a micro-architectural sample for a certain counter or location that occurred at a different time instance. Additionally and/or alternatively, another approach for processing time-dependent data (micro-architectural data) using classifiers that are generally not configured to handle sequences of time-dependent data is to separately process with such a classifier data points taken for a particular process at different time instances, and aggregate the classifier's results in order to classify the entire process. In some embodiments, different aggregation operations may be applied to a classifier's results, and the aggregation operation that is determined (e.g., through testing and experimentation) to yield the best classification results may be used to perform future aggregation operations. For example, one aggregation operation that may be used is a simple average operation. Another aggregation operation that may be used is a weighted average operation in which, for example, data points which are equally probable to belong to each of the various available classes are given zero weight, whereas data points with high probabilities are given relatively large weights.
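The two approaches described above can be illustrated with a minimal Python sketch. The function names and the particular weighting formula (twice the distance of a probability from 0.5) are assumptions chosen for illustration; the text only requires that equally probable data points receive zero weight and confident data points receive large weights.

```python
from typing import List, Sequence


def flatten_time_series(samples: Sequence[Sequence[float]]) -> List[float]:
    """Arrange counter samples taken at different time instances into one feature vector."""
    return [value for sample in samples for value in sample]


def simple_average(probs: Sequence[float]) -> float:
    """Aggregate per-sample malicious probabilities with a plain mean."""
    return sum(probs) / len(probs)


def weighted_average(probs: Sequence[float]) -> float:
    """Aggregate per-sample probabilities, ignoring undecided samples.

    Assumed weighting: |p - 0.5| * 2, which is 0 when both classes are
    equally probable and 1 when the classifier is certain.
    """
    weights = [abs(p - 0.5) * 2.0 for p in probs]
    total = sum(weights)
    if total == 0.0:
        # Every sample was undecided, so the aggregate is undecided too.
        return 0.5
    return sum(w * p for w, p in zip(weights, probs)) / total
```

With per-sample outputs of 0.5 and 0.9, the simple average is pulled toward 0.7, while the weighted average discards the undecided sample and reports approximately 0.9.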

The types of classifiers that may be used to process/analyze the collected micro-architectural data points corresponding to the executing processes belong to two main classifier categories: linear classifiers, and non-linear classifiers. Linear classification procedures are configured to attempt to separate n-dimensional data points by a hyperplane: points on one side of the plane are points of class X and points on the other side are of class Y. Non-linear classifiers generally do not rely on this type of linear separation. Thus, any operation to derive a classification may be applied.
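As a small illustration of the hyperplane separation described above, the following Python sketch classifies a point by the sign of its distance from a hyperplane. The weight vector, bias, and function name are hypothetical; in practice they would be learned from training data.

```python
from typing import Sequence


def linear_classify(weights: Sequence[float], bias: float, point: Sequence[float]) -> str:
    """Return class "X" or "Y" depending on which side of the hyperplane the point lies.

    The hyperplane is the set of points where dot(weights, point) + bias == 0.
    """
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return "X" if score >= 0 else "Y"
```

For the hypothetical hyperplane x1 - x2 = 0, the point (2, 1) falls on the class X side and the point (1, 2) falls on the class Y side.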

In some of the implementations described herein, non-linear classifiers were used to perform the data processing/analysis operations to reflect the fact that the data (e.g., micro-architectural data) that was used to determine whether at least one executing process may be malicious, or to identify a malicious process, may not necessarily be linearly-separable. Some examples of classifiers, configured to determine if a particular process (for which micro-architectural time-based data was collected) is malicious or non-malicious, that may be used with implementations of the AV engine 210 include:

-   K-Nearest Neighbors (KNN): A KNN classifier is trained by inserting the training data points along with their labels into a spatial data structure, like a k-dimensional tree (referred to as a “k-d-tree”) used for organizing points/data in a k-dimensional space. In order to classify a data point, that point's k nearest neighbors (in Euclidean space) are found using the spatial data structure. The probability that the data point is of a particular class is determined by how many of the data point's neighbors are of that class and how far they are from each other.
-   Decision Tree: Another way to classify data points is to use a non-spatial tree called a decision tree. This tree is built by recursively splitting training data into groups on a particular dimension. The dimension and split points are chosen to minimize the entropy within each group. These decisions can also integrate some randomness, decreasing the quality of the tree but helping to prevent overtraining. After some minimum entropy is met, or a maximum depth hit, a branch terminates, storing in it the mix of labels in its group, e.g., 30% malware vs. 70% non-malware. To classify a new data point, the decision tree classifier traverses the tree to find the new point's group (leaf node), and returns the stored mix.
-   Random Forest: One way to increase the accuracy of a classifier is to use a lot of different classifiers and combine the results. In a random forest, multiple decision trees are built using some randomness. When classifying a new data point, the results of all trees in the forest are weighted equally to produce a result.
-   Artificial Neural Network (ANN): A neural network machine attempts to model biological brains by including neurons which are connected to each other with various weights. The weight values between connections can be varied, thus enabling the neural network to adapt (or learn) in response to training data it receives. In feed-forward neural nets, input values are supplied at one edge and propagate through a cycle-less network to the output nodes. In some embodiments, one input neuron for each dimension, and two output nodes (e.g., one indicating the probability that malware is running, one indicating the probability that non-malware is running) are defined.
-   Tensor Density: This classifier discretizes the input space into different buckets. Each bucket contains the mix of classes in the training data set. A data point is classified by finding its bin and returning the stored mix. Generally, a tensor density classifier uses O(1) lookup time, and is thus considered to be time-efficient.
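To make the bucket-lookup idea behind the tensor density classifier concrete, the following is a minimal Python sketch. The class name, the uniform `bucket_width` parameter, and the label strings are assumptions made for illustration and are not taken from the implementations described herein.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence, Tuple


class TensorDensityClassifier:
    """Discretize the input space into buckets and store the class mix per bucket."""

    def __init__(self, bucket_width: float) -> None:
        # Assumed: every dimension uses the same bucket width.
        self.bucket_width = bucket_width
        self.buckets: Dict[Tuple[int, ...], Counter] = defaultdict(Counter)

    def _bucket(self, point: Sequence[float]) -> Tuple[int, ...]:
        # Map each coordinate to the index of the bucket that contains it.
        return tuple(int(x // self.bucket_width) for x in point)

    def train(self, points: Sequence[Sequence[float]], labels: Sequence[str]) -> None:
        for point, label in zip(points, labels):
            self.buckets[self._bucket(point)][label] += 1

    def classify(self, point: Sequence[float]) -> Dict[str, float]:
        """Return the stored class mix for the point's bucket (empty if unseen)."""
        counts = self.buckets.get(self._bucket(point))
        if not counts:
            return {}
        total = sum(counts.values())
        return {label: n / total for label, n in counts.items()}
```

Classification is a single dictionary lookup, which reflects the constant-time behavior noted in the list above; the cost is that buckets never seen in training return no information.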

Other classifiers that may be used also include, in some embodiments, a support vector machine configured to generate, for example, classification functions or general regression functions. In some embodiments, the classifiers may be implemented using regression techniques to derive best-fit curves, a classification procedure based on a hidden Markov model, and/or other types of machine learning techniques. In embodiments in which a hidden Markov model-based classifier is used, patterns in the data (e.g., micro-architectural data) being processed may be identified using self-similarity analysis, and the transitions in patterns may be used to build the hidden Markov model with which malware/goodware can be predicted/classified. Additionally, linear classification techniques like kernel methods which are capable of accurately classifying data but with reduced computational requirements may also be used.

To train the classifiers to identify suspected malicious processes based on micro-architectural data collected from a hardware-based device that is to be monitored, in some implementations, a remote system whose hardware configuration may be the same or similar to the hardware configuration of the hardware device with respect to which the procedures described herein are performed, may execute variants of a known malicious process (e.g., a malware discovered and/or tracked by some third party). Micro-architectural data resulting from execution of the variants of the particular malware (e.g., represented in a form that may be similar to the data used to generate graphs similar to those illustrated in FIG. 1) is collected. Periodically, data representative of the micro-architectural data captured by the remote system may be communicated to the AV engine 210, and stored on the database 214. The remote system may also provide micro-architectural data corresponding to known non-malicious processes. The example micro-architectural data communicated by the remote system may be used to train the classifier 216 by providing that micro-architectural data and the respective identities and/or type (e.g., malicious or non-malicious) of the processes that caused that micro-architectural data to be produced to at least some of the one or more of the classifiers 216 a-n. The training data will cause the classifiers 216 a-n to be configured (e.g., dynamically configured) so that upon presenting similar micro-architectural data (collected from the hardware device to be monitored) to the classifiers, output consistent with the process types/identities of the training data will be produced.

As noted, an AV engine, such as the AV engine 210 of FIG. 2, may be realized entirely in hardware (e.g., implemented as a module on the hardware device that is to be monitored), entirely in software (e.g., as a multi-module application executing on a computing system that includes the hardware to be monitored), or as a hardware-software combination implementation in which one component (e.g., the sampling unit 212 of FIG. 2) is implemented in hardware, while the database and classifier units 214 and 216 are implemented via software. If implemented at least partly by software, the software components may be configured to communicate with the hardware component (e.g., using an interfacing procedure) to receive data (e.g., micro-architectural data obtained by the sampling unit) and/or to transmit data or control signals to the hardware-based component.

In addition to being configured to collect and store micro-architectural data and analyze collected micro-architectural data to determine whether or not malicious behavior is occurring (and possibly more particularly identify the malicious process(es)), the AV engine 210 is also configured to take certain actions if a threat is detected (e.g., shut down the hardware or report the malicious behavior), and update the AV engine with malicious process definitions and micro-architectural signatures. More particularly, there are a wide variety of security policies that can be implemented by an AV engine such as the AV engine 210. Some viable security policies include:

-   Using the AV engine as a first-stage malware predictor: When the AV engine suspects a program to be malicious it can run more sophisticated behavioral analysis on the program. Hardware analysis happens ‘at speed’ and is significantly faster than behavioral analysis used by malicious process analysts to create signatures. Such pre-filtering can avoid costly behavioral processing for ‘goodware.’
-   Migrating sensitive computation: In multi-tenant settings such as public clouds, when the AV engine suspects that an active thread on the system is being attacked (e.g., through a side-channel) the AV engine can move the sensitive computation. In some scenarios it may be acceptable for the AV system to simply kill a suspect process.
-   Using the AV engine for forensics: Logging data for forensics is expensive as it often involves logging all interactions between the suspect process and the environment. To mitigate these overheads, the information necessary for forensics can be logged only when the AV engine suspects that a process is malicious.
-   Screening for goodware: In some embodiments, the hardware-based micro-architectural data collected can be used to identify non-malicious processes, and to corroborate that those processes are in fact non-malicious. For example, in some implementations, underlying code samples of processes identified by the AV engine as non-malicious can be analyzed by, for example, comparing the code sample to known code samples (that were previously obtained) corresponding to the processes analyzed. If the examined underlying code of the executing processes matches the known code listing previously obtained, the executing process is confirmed as being non-malicious.

Thus, there is a broad spectrum of actions that can be taken based on the AV engine's output. The systems and procedures described herein to implement an AV engine should be flexible enough to implement the above-described security policies. Conceptually, this means that, in some embodiments, the AV engine should be able to interrupt computation on any given core and run the policy payload on that machine. This requires the AV engine to be able to issue a non-maskable inter-processor interrupt. Optionally, in some embodiments, the AV engine can communicate to the OS or supervisory software that it has detected a suspect process so that the system can start migrating other co-resident sensitive computation. In some embodiments, the AV engine may also be configured to run in the highest privilege mode.

Additionally, as noted, in some embodiments, the AV engine 210 may be configured to be updated with new malware signatures as they become available, or when new classification techniques are implemented. The AV update should be implemented in a way to prevent attackers from compromising the AV. For instance, a rogue user should not be able to mute the AV engine or subvert the AV engine to create a persistent, high-privilege rootkit.

Generally, security updates may include one or more of, for example, a classifier, an action program that specifies security policies, a configuration file that determines which performance features are to be used with what classifiers, micro-architectural data for malicious and/or non-malicious processes, and/or an update revision number. This data can be delivered to the AV engine securely using techniques/procedures adapted for a hardware setting. A schematic diagram of an example security update payload 800 that is to be sent from a system security vendor, including the various encryption levels applied to the payload, is depicted in FIG. 8A. An example procedure 850, generally performed by an AV engine, to receive a security update payload (such as the encrypted payload 800) and update the configuration of the AV engine, is depicted in FIG. 8B. As shown in the figure, the procedure 850 includes receiving 855 the payload, and decrypting 860 the payload with a “verif” key embedded in the hardware (on which the AV engine is implemented). A determination is then made 865 of whether a resulting hash of the “verif” key matches the expected hash of the “verif” key embedded in the hardware. If it does not, the procedure 850 terminates 870. If there is a match of the hash of the “verif” key, a determination is made 875 of the integrity of the payload with a SHA-2 hash function. If the integrity is confirmed, the payload is decrypted 885 with an AES key (otherwise, the procedure terminates 880), and upon a determination that the update revision number indicated in the payload is in agreement with a revision number indicator maintained in the hardware device (at 890), the updates in the payload are applied 895.

As indicated in relation to the operation 890 of the procedure 850, in some embodiments, the hardware device on which the AV engine is, at least partly, implemented, maintains the revision number of the last update, and that revision number is incremented on every update. This is to prevent/inhibit an attacker from rolling back the AV system, which an attacker might do to prevent the system from discovering new malicious processes. The AV engine may offer this protection by rejecting updates with a revision number that is older than the revision number maintained in the hardware counter.
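The integrity check and the rollback protection can be sketched in Python as follows. This is an illustrative sketch only: it uses SHA-256 as one member of the SHA-2 family, it omits the “verif” key and AES decryption steps, and the names `integrity_ok` and `RevisionCounter` are hypothetical. The text does not say how an update carrying the same revision number as the stored one is handled; the sketch accepts it and rejects only strictly older revisions. It also assumes the stored counter advances to the revision of the accepted update.

```python
import hashlib
import hmac


def integrity_ok(body: bytes, expected_digest: bytes) -> bool:
    """Check a payload body against its expected SHA-256 digest."""
    actual = hashlib.sha256(body).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(actual, expected_digest)


class RevisionCounter:
    """Stand-in for the revision number maintained in the hardware device."""

    def __init__(self, current: int = 0) -> None:
        self.current = current

    def accept(self, update_revision: int) -> bool:
        """Reject updates older than the stored revision; otherwise record and accept."""
        if update_revision < self.current:
            return False  # rollback attempt
        # Assumption: the counter advances to the accepted update's revision.
        self.current = update_revision
        return True
```

Once revision 6 has been applied on a device that previously held revision 5, a later attempt to apply revision 5 is rejected, which is the rollback scenario the paragraph above describes.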

With reference now to FIG. 3, a flowchart of an example procedure 300 to detect malicious processes is shown. The procedure 300 includes obtaining 310 hardware-based micro-architectural data, including hardware-based micro-architectural counter data, for a hardware device executing one or more processes. As noted, in some embodiments, obtaining the micro-architectural data may be performed by a sampling unit, which may be implemented, at least partly, in hardware as part of the hardware device that is to be monitored (i.e., the hardware device executing the one or more processes with respect to which the micro-architectural data is to be collected). In some embodiments, the micro-architectural data may be obtained periodically at regular or irregular intervals (e.g., at intervals of length determined by a pseudo random process), and may be obtained through a data-pull process (e.g., by the sampling unit initiating the collection of the micro-architectural data, with or without interrupting the hardware device being observed) or through a data-push process (e.g., the hardware device initiating periodic communication of micro-architectural data to an AV engine).
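As an illustration of sampling at irregular intervals determined by a pseudo random process, the following Python sketch generates a schedule of interval lengths. The function name, the cycle-count bounds, and the use of a seeded software generator are assumptions for illustration; a hardware sampling unit would more likely derive its intervals from a source that monitored software cannot predict.

```python
import random
from typing import List


def sampling_intervals(count: int, min_cycles: int, max_cycles: int, seed: int) -> List[int]:
    """Return `count` pseudo-random interval lengths, each within [min_cycles, max_cycles]."""
    rng = random.Random(seed)  # seeded so that the schedule is reproducible
    return [rng.randint(min_cycles, max_cycles) for _ in range(count)]
```

Using irregular intervals makes it harder for a monitored process to time its activity so that it falls between samples.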

Based, at least in part, on the obtained hardware-based micro-architectural data, a determination is made 320 whether at least one of the one or more processes executing on the hardware device corresponds to a malicious process. In some embodiments, a more specific determination may be made of the type or identity of the at least one of the one or more processes executing on the hardware device. As noted, determination of whether at least one process is malicious and/or the type and/or identity of the at least one of the one or more processes executing on the hardware device may be performed using a machine learning system that may include one or more classifiers (such as the one or more classifiers 216 a-n) that were trained with training data including micro-architectural data for variants of known malicious and non-malicious processes.

Thus, when a variant of a known malicious process executes on the hardware device to be monitored, even in situations where the exact implementation of the malicious process has been modified, the malicious process will generally perform operations that are characteristic of that process (e.g., accessing particular modules, retrieving specific types of data, etc.). These operations that are characteristic of the known malicious process will result in a micro-architectural data signature (that may be represented as a time series), which may then be identified (or at least identified as being malicious or non-malicious) through at least one of the one or more classifiers of the machine learning system of the AV engine.

With reference to FIG. 4, an example system 400 in which an AV engine (such as the AV engine 210 of FIG. 2) is implemented, is shown. The system 400 includes a hardware device such as controller device 410, which may be a processor-based personal computer, a specialized computing device, and so forth, and which includes, in some implementations, a processor-based unit such as central processor unit (CPU) 412. In some embodiments, the controller device 410 may be realized, at least in part, using modules such as an FPGA (field programmable gate array), an ASIC (application-specific integrated circuit), a DSP processor, etc.

As noted, in some embodiments, at least part of the AV engine may be implemented in hardware directly on the hardware device that is to be monitored, and/or may be implemented in software executing on a dedicated and secure controller device. For example, as depicted in FIG. 4, the CPU 412 may be a multi-core processor, and the hardware portion of the AV engine may thus be realized on one or more of the cores 413 of the CPU 412, and be configured (e.g., through pre- or post-manufacturing programming) to perform one or more of the functions of the AV engine (e.g., collect micro-architectural data). If the hardware device to be monitored is an application-specific controller device (e.g., implemented as an application-specific integrated circuit), the hardware portion of the AV engine may be realized at the time of manufacturing of the controller, e.g., as special-purpose malware detection units that sit on a network-on-chip, on-chip/off-chip FPGA, or off-chip ASIC co-processor. These choices represent different trade-offs in terms of flexibility and area- and energy-efficiency. Moving security protection to the hardware level solves several problems and provides some interesting opportunities. First, it ensures that the security system cannot be disabled by software, even if the kernel is compromised. Second, because the security system runs beneath the operating system, the security system might be able to protect against kernel exploits and other attacks against the kernel. Third, because the hardware itself is being modified (to accommodate at least some portions of the AV engine), arbitrary static and dynamic monitoring capabilities can be added. This gives the security system extensive viewing capabilities into software behavior.

As further shown in FIG. 4, in addition to the CPU 412 and/or other application-specific hardware to implement controller functionality, the system 400 includes main memory, cache memory and bus interface circuits (not shown in FIG. 4). For example, the controller device 410 may include a mass storage element 414, such as a hard drive or flash drive associated with the system. The computing system 400 may further include a keyboard, or keypad, or some other user input interface 416, and a monitor 420, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, etc., that may be placed where a user can access them.

The controller device 410 is configured to facilitate, for example, the implementation of operations to obtain hardware-based micro-architectural data resulting from execution of one or more processes on the CPU 412 and/or on some other application-specific device on which processes are executing (or can be executed) and determine, based on the micro-architectural data obtained, whether at least one of one or more of the processes executing on the controller device 410 of the system 400 is a potentially malicious process (e.g., malware). In some embodiments, identities of the one or more processes executing on the hardware of the controller device 410 may be determined based on the micro-architectural data collected. The storage device 414 may thus include a computer program product that when executed on, for example, a processor-based implementation of the controller device 410 causes the device to perform operations to facilitate the implementation of procedures described, including the procedures to obtain micro-architectural data and determine based on that data whether at least one of the one or more executing processes is potentially malicious.

The controller device 410 may further include peripheral devices to enable input/output functionality. Such peripheral devices may include, for example, a CD-ROM drive and/or flash drive (e.g., a removable flash drive), or a network connection (e.g., implemented using a USB port and/or a wireless transceiver), for downloading related content to the connected system. Such peripheral devices may also be used for downloading software containing computer instructions to enable general operation of the respective system/device. As noted, alternatively and/or additionally, in some embodiments, special purpose logic circuitry, e.g., an FPGA (field programmable gate array), an ASIC (application-specific integrated circuit), a DSP processor, etc., may be used in the implementation of the system 400. Other modules that may be included with the controller device 410 are speakers, a sound card, and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the system 400. The controller device 410 may include an operating system, e.g., the Windows XP® operating system from Microsoft Corporation, the Ubuntu operating system, etc.

Computer programs (also known as programs, software, software applications or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the term “machine-readable medium” refers to any non-transitory computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a non-transitory machine-readable medium that receives machine instructions as a machine-readable signal. Non-transitory computer readable media can include media such as magnetic media (such as hard disks, floppy disks, etc.), optical media (such as compact discs, digital video discs, Blu-ray discs, etc.), semiconductor media (such as flash memory, electrically programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), etc.), any suitable media that is not fleeting or not devoid of any semblance of permanence during transmission, and/or any suitable tangible media.

Some or all of the subject matter described herein may be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a Web browser through which a user may interact with an embodiment of the subject matter described herein), or any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), and the Internet.

The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server generally arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

To evaluate the efficacy of the systems and procedures described herein to detect/identify the existence of different types of malicious processes, testing was conducted for different hardware devices (e.g., different processor architectures) and for different malicious processes (malware, side channel attack processes, etc.).

The evaluations and testing performed included testing to determine the efficacy of the systems and procedures described herein to detect Android malware. Examples of Android malware include malware to create advertisements and install unwanted links, cluttering up the user's device, etc. More advanced malware may take advantage of a phone's features, e.g., making phone calls or sending text messages to premium services, resulting in charges on the user's cell phone bill. Other types of Android malware may compromise a user's privacy in various ways, including accessing information like phone numbers, contact information, IMEI numbers and other sensitive data. Moreover, many Android-based mobile devices have GPS capability, and therefore malware may be capable of physically tracking victims.

The systems and procedures described herein were applied to Android malware obtained from various sources that catalog or study malware. The malware data sets acquired were divided into families of variants. In families with only one variant, different execution cycles were used to acquire micro-architectural data for the malware specimen. For families with more than one variant, some of the variants were used for training purposes (e.g., generally about ⅓ of the variants were used for training), while the remaining variants were used for testing (e.g., to determine if the systems and procedures described herein would detect those variants). FIG. 5 is a table 500 of some of the Android malware families that were tested. Column 502 in the table 500, identified as APKs, indicates the number of variants that were available for the respective malware families.

To train the classifiers that were used to process micro-architectural data, micro-architectural performance data was collected on all malware samples. In the evaluations and testing performed, the collection infrastructure operated at the thread level. In addition to data on malware, data for 86 non-malware applications was also collected, resulting in data on 92,475 non-malware threads. These data sets were used for both training and testing, with generally ⅓ of the data sets used for training, and the rest of the data sets used for testing. The performance of the various classifiers that may be used to process the micro-architectural data can be adjusted using their respective configuration parameters. For instance, for a k-Nearest Neighbors (KNN) classifier, k is an adjustable configuration parameter. To identify an optimal set of parameters, the classifier (or, in some situations, several classifiers) that is chosen is the one that identifies the most malware correctly. However, the amount of malware identified varies with the false positive rate. As a classifier is configured to make it more sensitive, more malware is identified, but non-malicious, legitimate processes are then also identified as malware. To determine which classifier (or classifiers) to use, in some embodiments, the classifier(s) that performs best (on the training data) for a given false positive percentage may be selected.
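The selection rule in the last sentence can be sketched in Python as follows. The sketch assumes each classifier emits a per-thread malicious score and that a thread is flagged when its score exceeds a threshold; the function names and the example scores are hypothetical and do not reproduce the data or tooling used in the evaluations.

```python
from typing import Dict, Sequence, Tuple


def threshold_for_fpr(benign_scores: Sequence[float], max_fpr: float) -> float:
    """Pick a threshold so that at most max_fpr of benign scores exceed it."""
    ordered = sorted(benign_scores, reverse=True)
    allowed = int(max_fpr * len(ordered))  # benign threads that may be flagged
    if allowed >= len(ordered):
        return float("-inf")
    return ordered[allowed]


def detection_rate(malware_scores: Sequence[float], threshold: float) -> float:
    """Fraction of malware scores strictly above the threshold."""
    return sum(1 for s in malware_scores if s > threshold) / len(malware_scores)


def best_classifier(
    scores: Dict[str, Tuple[Sequence[float], Sequence[float]]], max_fpr: float
) -> Tuple[str, float]:
    """Return the classifier name with the highest detection rate at the given FPR.

    `scores` maps a classifier name to (benign_scores, malware_scores)
    obtained on the training data.
    """
    rates = {
        name: detection_rate(malware, threshold_for_fpr(benign, max_fpr))
        for name, (benign, malware) in scores.items()
    }
    best = max(rates, key=rates.get)
    return best, rates[best]
```

Fixing the false positive budget first and then comparing detection rates keeps the comparison between classifiers fair, since each one could otherwise appear better simply by being configured to be more sensitive.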

FIG. 6 contains a graph 600 showing the accuracy of various binary classifiers in detecting Android malware. As illustrated in the graph 600, as the false positive rate is increased, the classifiers find more malware (column 504 of table 500 in FIG. 5 provides the rate for correctly identifying the executed processes for the various malware families on which the classifiers, such as the decision tree classifier, were applied for a false positive rate of 10% or better). The results obtained for the Android malware testing indicate that the classifiers tested work properly and that micro-architectural data (including micro-architectural performance counter data) can, with simple analysis, be used to detect Android malware with relatively good accuracy. For example, the “AnserverBot” malware had 187 known variants (which, as noted, were obtained from third parties that study and categorize malicious processes such as Android malware). Of those 187 known variants, 61 variants were used to train the classifiers of the systems and procedures described herein. After being trained with those 61 variants, the classifiers tested were able to identify 96.6% of the threads of the remaining 126 variants.

Evaluations and testing to determine the efficacy of the systems and procedures described herein were also performed on known Linux rootkits. Rootkits are malicious software that attackers install on compromised systems to evade detection and maximize their period of access on the systems. Once installed, rootkits hide their presence in the systems, typically by modifying portions of the operating systems to obscure specific processes, network ports, files, directories and session log-on traces. With their stealth capabilities, they can pose a significant threat to systems security due to the difficulty in detecting such infections.

In evaluating and testing the efficacy of the systems and procedures to detect Linux rootkit processes, two publicly available Linux rootkits were used that, once loaded, gave an attacker the ability to hide log-on session traces, network ports, processes, files and directories. The two rootkits used were:

1. Average Coder Rootkit: This rootkit works as a loadable kernel module that hides traces by hooking the kernel file system function calls. It is loaded into the kernel via the Linux command insmod. It allows the attacker to modify, at runtime, the system information to hide by writing via the echo command to a pre-defined file, /proc/buddyinfo.

2. Jynx2 Rootkit: This rootkit functions as a shared library and is installed by configuring the LD_PRELOAD environment variable to reference this rootkit. When this is done, the rootkit is executed as a shared library whenever any program runs. The information it hides is pre-configured at compile-time and cannot be modified once it is loaded.

The Linux operating system has native utility programs that produce listings of the system's current state (such as current process listing and network ports). To evade detection, the rootkits are designed to obscure portions of the output of these programs. Therefore, it is likely that micro-architecture performance counter data for these programs (produced on the processor device on which the rootkit processes are executing) will show some degree of deviation after a rootkit infection. To examine the presence of such deviation, collection of per-process performance counter data focused on the following processes:

Program    Relevant function
ps         List active running processes
ls         List files and directories
who        List active log-on sessions
netstat    List active network connections

Micro-architectural performance counter data was collected for various arbitrarily selected event types (such as number of branch mispredictions, number of data TLB misses, number of L1 instruction cache reads) for multiple execution runs of all the programs. Two sets of data were collected: one set was collected before the rootkits were installed (that collected set was referred to as the “clean set”), and the second set was collected after the system was infected with the rootkits (that set was referred to as the “dirty set”). To introduce variation to the execution flows of the programs, each run of the programs was executed with a random combination of their associated parameters. To do this, a list of command-lines comprising the program names combined with a random set of valid parameters was generated. Each command-line was then randomly tagged as either clean or dirty to indicate the set of data it would be used in. An example subset list of the command-lines is provided below:

(clean) netstat -n
(clean) netstat -nt
(dirty) netstat -ntu
. . .
(dirty) ls -l /usr/include
(clean) ls -ld /home
(dirty) ls -lar /home/user
(clean) ls -lart . . . / . . .
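By way of illustration only, generation of such a randomly tagged list of command-lines may be sketched as follows. The option pools, the function name and the seed are hypothetical and chosen for illustration. The sketch does not check that every combination of options is meaningful for the utility, and it does not execute the commands.

```python
import random

# Hypothetical pools of options for each utility (illustrative only).
OPTIONS = {
    "ps": ["-e", "-f"],
    "ls": ["-l", "-a", "-r", "-t", "-d"],
    "who": ["-H", "-u"],
    "netstat": ["-n", "-t", "-u", "-a"],
}


def generate_command_lines(count, seed=0):
    """Return `count` (tag, argv) pairs: a program with a random subset of its
    options, randomly tagged for the clean or the dirty collection run."""
    rng = random.Random(seed)  # seeded so the list can be regenerated
    lines = []
    for _ in range(count):
        program = rng.choice(sorted(OPTIONS))
        pool = OPTIONS[program]
        flags = rng.sample(pool, rng.randint(0, len(pool)))
        tag = rng.choice(["clean", "dirty"])
        lines.append((tag, [program] + flags))
    return lines


if __name__ == "__main__":
    for tag, argv in generate_command_lines(5):
        print(f"({tag}) {' '.join(argv)}")
```

Seeding the generator is a design choice made here so that the same list can be replayed for each rootkit; the source does not state how the randomization was performed.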

With the random list of command-lines generated, per-process per-run performance data was collected. Additionally, to reduce input bias and to make the collected data more realistic, the action of various users logging into the server and doing a series of tasks (like creating new files and running new processes) was simulated. Because both the rootkits that were used have different stealth capabilities and target the outputs of different programs, dirty data was collected separately for each rootkit. The collection of the data for each rootkit was performed with the following programs it was designed against:

Program Average Coder Jynx2 ps ✓ ls ✓ who ✓ netstat ✓

While the dirty data was being collected, the information hidden by the rootkits was also varied. This included adding to, and removing from, the list of network ports, files, processes and log-on session logs that were hidden by the rootkits. As with the testing performed with respect to the Android malware, the micro-architectural data collected about the rootkits was divided into testing and training sets, with ⅓ of the data being used for training a large number of classifiers, and the remaining data used for testing the trained classifiers. The classifiers were trained to determine if the processes/programs for which micro-architectural data was being collected were running with or without rootkits (i.e., whether or not there was a rootkit contamination). FIG. 7 includes graphs 700 showing the accuracy (in terms of the number of correctly identified malicious threads as a function of false-positive rate) of the classifiers used as part of the AV engine implemented herein. Although the accuracy achieved by the systems and procedures described herein for rootkit detection is generally lower than that achieved when the systems and procedures were applied to the Android malware, it is to be noted that because rootkits do not operate as separate programs, but rather are configured to dynamically intercept programs' normal control flow, the training data used is affected by the rootkits to a relatively small degree. As a result, identification of rootkits is generally more difficult than identification of other types of malicious processes.
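By way of illustration only, the division of collected data into a one-third training set and a two-thirds testing set, followed by classification of a counter vector as clean or dirty, may be sketched as follows. A plain k-nearest-neighbor vote is used as a stand-in for the classifiers of the AV engine; the function names, the choice of k, and the representation of each sample as a (counter vector, label) pair are assumptions made for the sketch.

```python
import math
import random


def split_one_third(samples, seed=0):
    """Shuffle the samples and return (training, testing), with one third of
    the samples used for training and the remainder for testing."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = len(shuffled) // 3
    return shuffled[:cut], shuffled[cut:]


def knn_predict(training, vector, k=3):
    """Label a counter vector by majority vote among its k nearest training
    samples (Euclidean distance). Each training sample is (vector, label)."""
    nearest = sorted(training, key=lambda s: math.dist(s[0], vector))[:k]
    dirty_votes = sum(1 for _, label in nearest if label == "dirty")
    return "dirty" if 2 * dirty_votes > len(nearest) else "clean"


def accuracy(training, testing, k=3):
    """Fraction of testing samples whose predicted label matches the true one."""
    correct = sum(knn_predict(training, v, k) == label for v, label in testing)
    return correct / len(testing)
```

In practice the counter values would typically be normalized per event type before distances are computed, since raw counts for different events can differ by orders of magnitude.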

Evaluation and testing of the systems and procedures described herein were also performed in relation to side-channel attacks. The term side-channel refers to unintended information leakage present in real implementations of systems. Because specific implementations cannot adhere to the idealized axioms of theoretical models, side-channels can be used to steal information from theoretically secure systems. For example, RSA cryptographic keys can be stolen by observing the performance of the branch predictor, or of the caches, for most existing implementations. Common side-channel mediums include acoustic or electrical signals, power draw, application-level timing channels, architectural or micro-architectural effects, or, in general, any shared resources. Although side-channel attacks are not generally considered malware, they render the security of a hardware-based device vulnerable, and furthermore, have characteristic micro-architectural behavior that may be detected by the systems and procedures described herein.

A side-channel “attacker” process is a process that gets placed within the system in such a way that it shares a resource and uses that resource to learn information. Micro-architectural examples include sharing a network card, a core pipeline, memory bandwidth and caches. In embodiments involving side-channel attacks on a cache, shared on-chip caches can leak tremendous amounts of data that can be readily used to, for example, steal cryptographic keys and/or other types of private data. Intuitively, attacker programs that exploit micro-architectural side-channels should have clear signatures in terms of performance. For example, side-channel attack processes repeatedly thrash a particular shared resource so as to gauge all the activity of the victim process with respect to that shared resource. Micro-architectural events and performance counters are therefore likely to take on extreme values during such attacks, and thus indicate the occurrence of attacker programs/processes (and possibly identify those attacker programs/processes).
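By way of illustration only, the observation that counters take on extreme values during such attacks can be sketched with a simple z-score test against a baseline of normal processes. This is not the classifier-based procedure described herein. The function name, the event names, the threshold of 4 standard deviations and all counter values are hypothetical.

```python
import statistics


def extreme_counters(baseline, observed, z_limit=4.0):
    """Return {event: z_score} for every observed counter that lies more than
    z_limit standard deviations from the mean of its baseline readings."""
    flagged = {}
    for event, value in observed.items():
        mean = statistics.mean(baseline[event])
        spread = statistics.stdev(baseline[event])
        if spread == 0:
            continue  # no variation in the baseline, so no z-score can be formed
        z = (value - mean) / spread
        if abs(z) > z_limit:
            flagged[event] = z
    return flagged


# Hypothetical per-interval readings from normal processes.
BASELINE = {
    "l1d_cache_misses": [1000, 1100, 950, 1050, 900],
    "branch_misses": [200, 210, 190, 205, 195],
}
# Hypothetical readings from a process that thrashes the L1 data cache.
OBSERVED = {"l1d_cache_misses": 9000, "branch_misses": 203}

suspicious = extreme_counters(BASELINE, OBSERVED)
```

With these hypothetical readings only the cache-miss counter is flagged, while the branch counter stays within its normal range.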

To test the efficacy of the systems and procedures described herein to determine the occurrence of side-channel attacks (and/or identify the specific side-channel attack processes), an array of cache side-channel attacks was implemented. Variants of the standard prime-and-probe technique were implemented, in which an attacker program/process wrote to every line in the L1 data cache, and then scanned the cache repeatedly (using a pattern chosen at compile time) to read every line. Whenever a miss occurred, it meant there was a conflict miss caused by the victim process sharing the cache. The result data of a successful prime-and-probe attack includes data about the cache lines used by the victim process over time. The prime-and-probe variants were implemented and executed against an OpenSSL victim process. The cache side-channel attack processes were compared against a wide array of normal processes, which included programs of SPEC2006 int, SPEC2006 fp, PARSEC, web browsers, games, graphics editors and other common desktop applications, as well as generic system-level processes.
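By way of illustration only, the prime-and-probe pattern described above can be modeled with a toy software simulation of a direct-mapped cache. The simulation is not the attack code used in the testing and does not touch real hardware; the cache geometry, the class and function names, and the victim addresses are hypothetical, and each cache set simply records which party last filled it.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_SETS = 64    # number of sets in the toy direct-mapped cache (assumed)


class ToyCache:
    """Direct-mapped cache model that tracks only the owner of each set."""

    def __init__(self):
        self.owner = [None] * NUM_SETS

    def access(self, who, address):
        """Simulate an access; return (hit, set_index) and install the line."""
        index = (address // LINE_SIZE) % NUM_SETS
        hit = self.owner[index] == who
        self.owner[index] = who
        return hit, index


def prime(cache):
    """Attacker fills every set with its own lines."""
    for s in range(NUM_SETS):
        cache.access("attacker", s * LINE_SIZE)


def probe(cache):
    """Attacker re-reads every set; a miss means the victim evicted that line.
    The probe also re-primes the cache for the next round."""
    missed = []
    for s in range(NUM_SETS):
        hit, index = cache.access("attacker", s * LINE_SIZE)
        if not hit:
            missed.append(index)
    return missed


cache = ToyCache()
prime(cache)
for victim_address in (0x1040, 0x2080):   # hypothetical victim accesses
    cache.access("victim", victim_address)
observed_sets = probe(cache)
```

In this model the probe round reveals the set indices the victim touched, and every probe performs one access per set, which is consistent with the extreme counter values discussed above.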

As with testing performed for the Android malware and Linux rootkits, ⅓ of the micro-architectural data collected was used to train the classifiers of the AV engine (namely, the KNN, DecisionTree, Tensor, RandomForest, and FANN classifiers). In this case, the training data included 3872 normal program threads and 12 attack threads. The trained classifiers were used to analyze the remaining two thirds of the collected data. The classifiers achieved perfect results when analyzing the 7744 normal threads and 24 attack threads of this example testing, detecting all 24 attack threads without producing any false positives. The results also indicated that in processing side-channel attack micro-architectural data, it did not matter which particular classifier was used.
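By way of illustration only, the figures reported above can be checked with a few lines of arithmetic. The thread counts are those stated in the preceding paragraph; the function and variable names are hypothetical.

```python
from fractions import Fraction

TRAIN = {"normal": 3872, "attack": 12}
TEST = {"normal": 7744, "attack": 24}


def rates(true_pos, false_neg, false_pos, true_neg):
    """Return (detection_rate, false_positive_rate) from confusion-matrix counts."""
    detection = Fraction(true_pos, true_pos + false_neg)
    false_positive = Fraction(false_pos, false_pos + true_neg)
    return detection, false_positive


# Share of all collected threads that was used for training.
train_total = sum(TRAIN.values())
test_total = sum(TEST.values())
train_share = Fraction(train_total, train_total + test_total)

# All 24 attack threads detected, no normal thread flagged.
detection_rate, false_positive_rate = rates(
    true_pos=TEST["attack"], false_neg=0, false_pos=0, true_neg=TEST["normal"]
)
```

The training share works out to exactly one third, and the reported outcome corresponds to a detection rate of 1 with a false positive rate of 0.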

Although particular embodiments have been disclosed herein in detail, this has been done by way of example for purposes of illustration only, and is not intended to be limiting with respect to the scope of the appended claims, which follow. Some other aspects, advantages, and modifications are considered to be within the scope of the claims provided below. The claims presented are representative of at least some of the embodiments and features disclosed herein. Other unclaimed embodiments and features are also contemplated.

What is claimed is:
 1. A method comprising: obtaining hardware-based micro-architectural data representative of a trace of micro-architectural activities performed by a process executing on a hardware device over a time period, the hardware-based micro-architectural data including hardware-based microarchitectural counter data; and determining based on the hardware-based micro-architectural data representative of the micro-architectural activities performed by the process executing on the hardware device over the period of time, and on multiple data sets of micro-architectural data representative of respective traces of activities performed by different anomalous processes executable on the hardware device during respective time periods, whether the process executing on the hardware device corresponds to one of the different anomalous processes.
 2. The method of claim 1, wherein the hardware-based micro-architectural data comprises hardware-based time-varying micro-architectural counter data, wherein the time-varying micro-architectural counter data measures events that occur on one or more circuits of the hardware device, with the events being internal to one or more processors executing the process, the events being counted on one or more counters of the one or more processors, and the one or more counters being configured to count the events.
 3. The method of claim 1, further comprising: in response to a determination that the process executing on the hardware device corresponds to one of the different anomalous processes, performing one of: terminating the execution of the process, shutting down the hardware device, reporting detection of the one of the different anomalous processes to a user, or migrating one or more of the process executing on the hardware device or another sensitive computation executing on the hardware device to another computation platform.
 4. The method of claim 1, wherein determining whether the process executing on the hardware device corresponds to one of the different anomalous processes comprises: applying one or more machine-learning procedures, trained using the multiple data sets of micro-architectural data representative of respective traces of activities performed by the different anomalous processes executable on the hardware device during the respective time periods, on the hardware-based micro-architectural data to determine whether the process corresponds to the one of the different anomalous processes.
 5. The method of claim 4, wherein the one or more machine learning procedures comprise one or more of: a k-nearest neighbor procedure, a decision tree procedure, a random forest procedure, an artificial neural network procedure, a tensor density procedure, or a hidden Markov model procedure.
 6. The method of claim 1, wherein determining whether the process executing on the hardware device corresponds to one of the different anomalous processes comprises: determining invariancy between the hardware-based micro-architectural data to the multiple data sets of micro-architectural data representative of the respective traces of activities performed by the different anomalous processes executable on the hardware device during the respective time periods to determine whether the process corresponds to the one of the different anomalous processes.
 7. The method of claim 1, wherein determining whether the process executing on the hardware device corresponds to one of the different anomalous processes comprises: performing one of a heuristic process or a non-heuristic process on the hardware-based microarchitectural data to determine whether the process corresponds to the one of the different anomalous processes.
 8. The method of claim 1, further comprising: obtaining updates for the multiple data sets of micro-architectural data representative of the respective traces of activities performed by the different anomalous processes.
 9. The method of claim 8, wherein obtaining the updates comprises: downloading encrypted data for the multiple data sets of micro-architectural data representative of the respective traces of activities performed by the different anomalous processes to an antivirus engine in communication with the hardware device providing the hardware-based micro-architectural data; decrypting at the antivirus engine the downloaded encrypted data for the multiple data sets of micro-architectural data representative of respective traces of activities performed by the different anomalous processes; and updating a revision counter maintained by the antivirus engine indicating a revision number of a most recent update of the multiple data sets of micro-architectural data representative of the respective traces.
 10. The method of claim 1, wherein obtaining the hardware-based micro-architectural data comprises: obtaining the hardware-based micro-architectural data at various time instances.
 11. The method of claim 10, wherein obtaining the hardware-based micro-architectural data at the various time instances comprises: performing one or more of a data push operation initiated by the hardware device to send the micro-architectural data, or a data pull operation, initiated by an antivirus engine, to send the micro-architectural data.
 12. The method of claim 1, wherein obtaining the hardware-based micro-architectural data comprises: obtaining multi-core hardware-based micro-architectural data resulting from execution of the process on a processor device with multiple processor cores executing multiple processes; and correlating the respective hardware-based micro-architectural data obtained from each of the multiple processor cores to the multiple processes.
 13. The method of claim 1, wherein the one of the different anomalous processes comprises one or more of: a malware process, or a side-channel attack process.
 14. A system for detection of anomalous program execution, the system comprising: a hardware device executing one or more processes; and an antivirus engine in communication with the hardware device, the antivirus engine configured to: obtain hardware-based micro-architectural data representative of a trace of micro-architectural activities performed by a process, from the one of the one or more processes executing on the hardware device, over a time period, the hardware-based micro-architectural data including hardware-based microarchitectural counter data; and determine based on the hardware-based micro-architectural data representative of the micro-architectural activities performed by the process executing on the hardware device over the period of time, and on multiple data sets of micro-architectural data representative of respective traces of activities performed by different anomalous processes executable on the hardware device during respective time periods, whether the process executing on the hardware device corresponds to one of the different anomalous processes.
 15. The system of claim 14, wherein the hardware-based micro-architectural data comprises hardware-based time-varying micro-architectural counter data, wherein the time-varying micro-architectural counter data measures events that occur on one or more circuits of the hardware device, with the events being internal to one or more processors executing the process, the events being counted on one or more counters of the one or more processors, and the one or more counters being configured to count the events.
 16. The system of claim 14, wherein the antivirus engine configured to determine whether the process executing on the hardware device corresponds to one of the different anomalous processes is configured to: apply one or more machine-learning procedures, trained using the multiple data sets of micro-architectural data representative of respective traces of activities performed by the different anomalous processes executable on the hardware device during the respective time periods, on the hardware-based micro-architectural data to determine whether the process corresponds to the one of the different anomalous processes.
 17. The system of claim 16, wherein the one or more machine learning procedures comprise one or more of: a k-nearest neighbor procedure, a decision tree procedure, a random forest procedure, an artificial neural network procedure, a tensor density procedure, or a hidden Markov model procedure.
 18. The system of claim 14, wherein the antivirus engine configured to determine whether the process executing on the hardware device corresponds to one of the different anomalous processes is configured to perform one or more of: determine invariancy between the hardware-based micro-architectural data to the multiple data sets of micro-architectural data representative of the respective traces of activities performed by the different anomalous processes executable on the hardware device during the respective time periods to determine whether the process corresponds to the one of the different anomalous processes; or perform one of a heuristic process or a non-heuristic process on the hardware-based microarchitectural data to determine whether the process corresponds to the one of the different anomalous processes.
 19. The system of claim 14, wherein the antivirus engine is implemented at one of: a module of the hardware device, or a remote system different from the hardware device.
 20. A non-transitory computer readable media comprising computer instructions executable on a programmable device to: obtain hardware-based micro-architectural data representative of a trace of micro-architectural activities performed by a process executing on a hardware device over a time period, the hardware-based micro-architectural data including hardware-based microarchitectural counter data; and determine based on the hardware-based micro-architectural data representative of the micro-architectural activities performed by the process executing on the hardware device over the period of time, and on multiple data sets of micro-architectural data representative of respective traces of activities performed by different anomalous processes executable on the hardware device during respective time periods, whether the process executing on the hardware device corresponds to one of the different anomalous processes.